On the Impact of Fast Failure Detectors on Real-Time Fault-Tolerant Systems

Authors

  • Marcos K. Aguilera
  • Gérard Le Lann
  • Sam Toueg
Abstract

We investigate whether fast failure detectors can be useful — and if so by how much — in the design of real-time fault-tolerant systems. Specifically, we show how fast failure detectors can speed up consensus and fault-tolerant broadcasts, by providing fast algorithms and deriving some matching lower bounds, for synchronous systems with crashes. These results show that a fast failure detector service (implemented using specialized hardware or expedited message delivery) can be an important tool in the design of real-time mission-critical systems.

Similar resources

Fast Asynchronous Uniform Consensus in Real-Time Distributed Systems

We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lack of bounded finite delays is antagonistic with timeliness requirements. We show how to circumvent this apparent contradiction, via the principle of “late binding” of a solution to some (partially) synchronous model. ...

Definition and properties of accrual failure detectors: an overview

Ensuring fast and accurate failure detection is a fundamental issue for building efficient fault-tolerant distributed systems. In an effort to make fault-tolerant applications easier to implement, we are trying to provide failure detection as a generic Internet service, similar to what was done very successfully with NTP (network time protocol) for clock synchronization. To do so, we must revis...

Failure Detectors: implementation issues and impact on consensus performance

Due to their nature, distributed systems are vulnerable to failures of some of their parts. Conversely, distribution also provides a way to increase the fault tolerance of the overall system. However, achieving fault tolerance is not a simple problem and requires complex techniques. An agreement problem known as the problem of consensus is at the heart of most problems encountered during the de...

Fault tolerant nano-satellite attitude control by adaptive modified nonsingular fast terminal control

In this paper, an adaptive fault tolerant nonlinear control is proposed for attitude tracking problem of satellite with three magnetorquers and one reaction wheel in the presence of inertia uncertainties, external disturbances, and actuator faults. Firstly, sliding surface variable is chosen based on avoiding the singularity of control signal and guaranteeing the convergence of attitude trackin...

Impact of a Failure Detection Mechanism on the Performance of Consensus

The paper considers a consensus algorithm for an asynchronous system augmented with failure detectors, and analyzes the impact of various failure detector implementations on its termination time. This study shows that the design of fault-tolerant distributed algorithms in the asynchronous system model augmented with failure detectors is orthogonal to implementing the actual failure detectors...

Publication date: 2002